Language Syntax in a Text Recognition Algorithm

Author

Jonathan J. Hull

Abstract

A Markov model for language syntax and its use in a text recognition algorithm are proposed. Syntactic constraints are described by the transition probabilities between syntactic classes. The confusion between the feature string for a word and the syntactic classes is also described probabilistically. A modification of the Viterbi algorithm is proposed that finds, for a given sentence, a fixed number of sequences of syntactic classes with the highest probabilities of occurrence, given the feature strings for the words. An experimental application of this approach is demonstrated with a word hypothesization algorithm that produces a number of guesses about the identity of each word in a running text. It is shown that the Viterbi algorithm can significantly reduce the number of words that can possibly match an image.
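The abstract's idea of keeping a fixed number of best class sequences can be illustrated with a minimal sketch. This is not the paper's implementation: the class names (`DET`, `NOUN`), the feature strings (`f1`, `f2`), every probability value, and the function name `top_k_viterbi` are hypothetical, chosen only to make the example self-contained. The sketch keeps the `k` best partial paths per class at each word position, one common way to obtain the overall `k` best sequences.

```python
import heapq
import math


def _log(p):
    # Log-probability, with zero mapped to negative infinity.
    return math.log(p) if p > 0 else float("-inf")


def top_k_viterbi(observations, classes, initial, transition, emission, k=2):
    """Return up to k (probability, class_sequence) pairs, best first.

    initial[c]          ~ P(first class is c)
    transition[p][c]    ~ P(class c follows class p)
    emission[c][f]      ~ P(feature string f | class c)
    """
    first = observations[0]
    # beams[c] holds up to k (log_prob, path) pairs for paths ending in class c.
    beams = {
        c: [(_log(initial[c]) + _log(emission[c].get(first, 0.0)), (c,))]
        for c in classes
    }
    for obs in observations[1:]:
        new_beams = {}
        for cur in classes:
            emit = _log(emission[cur].get(obs, 0.0))
            # Extend every retained path by the current class.
            candidates = [
                (score + _log(transition[prev][cur]) + emit, path + (cur,))
                for prev in classes
                for score, path in beams[prev]
            ]
            new_beams[cur] = heapq.nlargest(k, candidates)
        beams = new_beams
    finals = [item for c in classes for item in beams[c]]
    best = heapq.nlargest(k, finals)
    return [(math.exp(s), list(p)) for s, p in best if s > float("-inf")]


# Hypothetical toy model: two syntactic classes, two feature strings.
CLASSES = ["DET", "NOUN"]
INITIAL = {"DET": 0.6, "NOUN": 0.4}
TRANSITION = {
    "DET": {"DET": 0.1, "NOUN": 0.9},
    "NOUN": {"DET": 0.5, "NOUN": 0.5},
}
EMISSION = {
    "DET": {"f1": 0.8, "f2": 0.2},
    "NOUN": {"f1": 0.3, "f2": 0.7},
}

if __name__ == "__main__":
    sentence = ["f1", "f2", "f2"]  # one feature string per word
    for prob, seq in top_k_viterbi(sentence, CLASSES, INITIAL, TRANSITION, EMISSION):
        print(f"{prob:.5f}", seq)
```

In a recognizer along the lines the abstract describes, the returned class sequences could then be used to discard word hypotheses whose syntactic class does not appear at the corresponding position in any retained sequence; that filtering step is not shown here.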

Similar resources

Incorporation of a Markov Model of Language Syntax in a Text Recognition Algorithm

A Markov model for language syntax and its use in a text recognition algorithm is proposed. Syntactic constraints are described by the transition probabilities between classes. The confusion between the feature string for a word and the syntactic classes is also described probabilistically. A modification of the Viterbi algorithm is also proposed that finds a fixed number of sequences of synt...

A Hidden Markov Model for Language Syntax in Text Recognition

The use of a hidden Markov model (HMM) for language syntax to improve the performance of a text recognition algorithm is proposed. Syntactic constraints are described by the transition probabilities between word classes. The confusion between the feature string for a word and the various syntactic classes is also described probabilistically. A modification of the Viterbi algorithm is also propo...

A new model for Persian multi-part words edition based on statistical machine translation

Multi-part words in English are hyphenated, with the hyphen separating the parts. Persian has multi-part words as well. Under Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead. This common incorrect use of the space leads to some s...

Detection and Recognition of Multi-language Traffic Sign Context by Intelligent Driver Assistance Systems

This paper concerns the design of a new intelligent driver assistance system based on traffic sign detection with Persian context. The primary aim of the system is to increase the precision of drivers in choosing their path with regard to traffic signs. To achieve this goal, a new framework that implements fuzzy logic was used to detect traffic signs in videos captured along a highway f...

Improving Persian Named Entity Recognition Using the Kasra Ezafe

Named entity recognition is a process in which the names of people, places (cities, countries, seas, etc.) and organizations (public and private companies, international institutions, etc.), as well as dates, currencies and percentages, are identified in a text. Named entity recognition plays an important role in many NLP tasks such as semantic role labeling, question answering, summarization, machine ...

Publication date: 1992
